Poster: EbbeHove Date: Jan 16, 2016 11:44am
Forum: texts Subject: Re: Removal of flagged content

Sorry to have to say this, but letting users upload blatantly mislabeled items does not seem like a professional approach to archiving.
Ebbe

Poster: stbalbach Date: Jan 16, 2016 7:17pm
Forum: texts Subject: Re: Removal of flagged content

The bad/missing metadata problem has been a source of frustration among users since the beginning of the archive. On the flip side, the owners of content can rest assured no one will mess with it. So it is attractive for uploaders such as institutions. The archive is not a typical archive. Not sure what the solution is. Perhaps create a Wikidata-like parallel cataloging system anyone can edit. A catalog to the Internet Archive catalog.

Stephen

Poster: EbbeHove Date: Jan 18, 2016 1:37am
Forum: texts Subject: Re: Removal of flagged content

Well, having used the IA for a number of years, I just think the problem is getting worse every day - hence the posting.
I cannot see why institutions should be interested in letting errors in uploads persist. If a community editor pointed out that a text was not in Danish but in German, why would they not be interested in having the error fixed? No credible community editor would want to "mess" with material.
Based on the very limited search referred to in the original post, the present stance of the IA reminds me of a child coming home from school with 238 boxes, declaring: "Look dad, I have just created an archive, and there are labels on every box." You look in the boxes and tell him, "Well this box says it is a Danish text, but there is a dead frog inside. Actually, 204 of these boxes are labelled totally wrong." And he will just frown and say, "Well my friends put these labels on, and I will not change them!" And then you will try to explain to him, what an archive really is ...
The idea of a Wikidata-like index might work, but I see it as a last resort, as it would require loads of copy-pasting of titles, authors and so on. What I hope for is that the donors to the IA will demand higher quality. The idea of preserving the pages of the Internet is brilliant, but the idea of preserving out-of-copyright material is not being carried out with the professionalism it deserves.
Ebbe

Poster: stbalbach Date: Jan 18, 2016 10:48am
Forum: texts Subject: Re: Removal of flagged content

I agree. Probably the cost of fixing the problem internally is prohibitive for the uploading institutions. And opening it to the community would violate the original terms of upload and raise other issues.

Another idea is to expand the flag feature so that the community can specify corrections on a field-level basis. These changes would be forwarded to the uploading institutions in some format (JSON) on a regular basis, and the institutions would have tools that allow mass approval, which reduces their workload. The JSON file is sent back to IA, and another tool then implements the changes across all the approved flagged works in that batch. It would give the uploaders final authority over changes, while the community provides free labor to fix problems.
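
A rough sketch of how that round trip might look, just to make the idea concrete. Everything here is an assumption for illustration: the record layout, the field names, the item identifiers and the three helper functions are hypothetical and do not correspond to any existing IA tool or API.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Correction:
    """One community-flagged, field-level correction (hypothetical layout)."""
    item_id: str      # made-up item identifier
    field: str        # metadata field being corrected
    old_value: str    # value the flagger saw
    new_value: str    # value the flagger proposes
    approved: bool = False


def export_batch(corrections):
    """Serialize flagged corrections to JSON for the uploading institution."""
    return json.dumps([asdict(c) for c in corrections], indent=2)


def approve_all(batch_json, reject_fields=()):
    """Mass-approve a batch, except fields the institution wants to review by hand."""
    rows = json.loads(batch_json)
    for row in rows:
        row["approved"] = row["field"] not in reject_fields
    return json.dumps(rows, indent=2)


def apply_batch(catalog, batch_json):
    """Apply approved corrections whose old value still matches the catalog."""
    applied = 0
    for row in json.loads(batch_json):
        record = catalog.get(row["item_id"])
        if record is None or not row["approved"]:
            continue
        if record.get(row["field"]) != row["old_value"]:
            continue  # metadata changed since the flag was raised; skip it
        record[row["field"]] = row["new_value"]
        applied += 1
    return applied


# Toy catalog standing in for the archive's metadata store.
catalog = {
    "item-001": {"language": "Danish", "title": "Example text"},
    "item-002": {"language": "Danish", "title": "Another text"},
}
flags = [
    Correction("item-001", "language", "Danish", "German"),
    Correction("item-002", "title", "Another text", "Another Text, vol. 1"),
]

outgoing = export_batch(flags)                              # IA -> institution
returned = approve_all(outgoing, reject_fields=("title",))  # institution's tool
count = apply_batch(catalog, returned)                      # institution -> IA
```

In this toy run the institution approves the language fix in bulk but holds back the title change, so only the first record is altered. Checking the old value before writing is one possible safeguard against applying a stale flag.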

Poster: garthus1 Date: Jan 18, 2016 10:11pm
Forum: texts Subject: Re: Removal of flagged content

stbalbach,

I think EbbeHove is referring to spam and not necessarily to mislabeled books. A good amount of material is being falsely uploaded and then has spam links inserted in it. This is what happens when you allow anonymous uploads. We are almost ready with the InfoPortal and much of this crap will come to an end. The research usefulness of the Archive is being impeded by such postings and apparently few at the Archive care about the integrity of their content. I have actually posted well over 3,000 items if you include multiple volume or item posts, and I respond almost immediately to information concerning problems with the metadata or descriptions of my items. Such behavior is unprofessional (the Archive allowing these materials) and should not be encouraged through allowing anonymous member postings.

Gerry

Poster: stbalbach Date: Jan 19, 2016 9:52am
Forum: texts Subject: Re: Removal of flagged content

I see. Well, there are multiple issues. For legitimate non-spam works, there is wrong/missing meta information, e.g. birth date of author, misspellings, etc. The same mechanism that fights spam (flagging) might also be useful for solving wrong/missing metadata.

I look forward to seeing what you come up with.

Stephen

Poster: PDpolice Date: Jan 19, 2016 2:45pm
Forum: texts Subject: Re: Removal of flagged content

There are many issues being discussed in this forum thread. Most of them are "wishful thinking" at best as the "Internet Archive" has shown no interest in the thoughts of file creator/up-loaders or site change suggestions and criticism by those creators.
But this mention of identification being part of the process is a subject I would like to comment on. I am against any internet activity or any machine-interpreted action being attributed to a physical human. I am not my computer. The contents of my internet-connected machine's data storage are only partially under my control and are easily changed by outside parties. At no time can everything on the machine be vouched for by me. For all I know the machine is used by government agencies to broadcast spam while I am away from it. Those could easily be marked as coming from "me".
In the case of uploads to the Internet Archive, there are other drawbacks to the idea of positive identification as a requirement to upload. For instance, I am currently one of several individuals who contribute to a collection of items. Although we upload files and create pages using a template of metadata, the whole collection is not under control of any one individual. The idea of having only one "person" be able to access the collection is inefficient.
At some time this entity will no longer be able to collect, curate, or upload items. Should that be the end of the historic items I now contribute to a collection here?

Identity does not matter to ideas. And, if it does, so what? My cat accidentally typed one of these words when it walked on the keyboard. Did it have more/less meaning? For all I know this forum is the output of that room full of monkeys typewriting I used to hear so much about. And yet the words do seem to be meaningful.
I hope this translated into English properly. Those characters look wrong.

Poster: garthus1 Date: Jan 19, 2016 10:03am
Forum: texts Subject: Re: Removal of flagged content

stbalbach,

I talked with them on several occasions about this problem and offered to work for free on the solution, but they were not interested. Actually it is simple to address this issue; we will be dealing with it for our Infoportal. There should be several categories of individuals who are allowed to upload ... trusted up-loaders would have more discretion as far as the permanency of their uploads. Non-trusted uploads should be able to be modified by the staff of the Archive ... this should be clearly stated. If I did not know any better, it would appear that some people may want it this way ... guarantees employment opportunities. This should have been addressed a long time ago. All I can say is that the problem is easily solved as far as preventing this crap from being placed into the Archive in the future. Cleaning up the mess that these bozos allowed to happen is another story ... that will take some work. The real solution is to never allow this to occur in the first place ... allowing corruption of your database is serious, especially when it could have been easily avoided in most cases. All I can say is that they could never have worked in the business that I worked in for over thirty years ... they would have been vaporized or worse (if they lived). Some fields of endeavor do not give one the luxury of being foolish or stupid; the Darwin equation selects you out of the population quickly.
All we can do is try to work around these issues, but it is such a waste of resources.

Gerry

Poster: stbalbach Date: Jan 20, 2016 4:11pm
Forum: texts Subject: Re: Removal of flagged content

You're a lawyer, I believe, so you probably already know this -- if your organization takes an active role in curating material, the org may also be responsible for the content itself, legally, which could mean lawsuits for copyright violations etc. OTOH if you are simply a conduit then the responsibility is with the uploader. It's the old phone company rule that Ma Bell is not responsible for what people do on its network. I don't know all the legal ins and outs.

Stephen

Poster: garthus1 Date: Jan 20, 2016 6:40pm
Forum: texts Subject: Re: Removal of flagged content

stbalbach,

This is true ... however, if we require identification for posting, 98% of the crap will not come. We do not even have to touch it, just associate the crap with real people and see how fast they change their tune. We will be doing this with Infoportal ... in exchange for having a good number of free services, they will have to be identifiable, at least to the staff of our organization. I assure you, that is not what the folks at IA are concerned about. They just do not think like engineers even if some of them are programmers ... look at the infatuation with Adobe Flash ... how much user time and Archive money was wasted on that crap. And just look at the scripting on their pages ... it drains your browser and machine. We will have absolutely no or almost no browser-side scripting.

Gerry

Poster: Jeff Kaplan Date: Jan 20, 2016 7:51pm
Forum: texts Subject: Re: Removal of flagged content

garthus1,
i wish you the best of luck with your site. i don't understand why you find it necessary to insult the engineers, programmers and other professionals who have worked tirelessly to create the largest free library on the web. i would appreciate your reconsidering the tone you're using here. if you dislike the archive that much you are free to take both your comments and materials elsewhere.

Poster: garthus1 Date: Jan 21, 2016 7:15pm
Forum: texts Subject: Re: Removal of flagged content

Jeff,
The way great things are built is to incorporate enhancements, which usually are the result of criticism that may not be liked by those responsible for the original creation. You completely misunderstand my intent here. I have said in the past and am still telling people today that the IA site is the best one out there, especially concerning the amount and type of content they have available. If I had felt the way you have written, I never would have spent, by my own count (which does not include any work I did for the Archive before 2010), nearly 6,000 hours (yes I do keep track of it and no I do not take any tax deductions for any of my contributions) of my own time creating content and putting it up. The time spent on copyright investigation alone is also not included, and that was not insignificant.
The INFOPORTAL will not be another IA; I do not think anyone could easily replace the IA, and that would not be our intent in any case. The problems I see, and upon which others are commenting here, really should have been addressed some time ago. Offers have been made, but no one seems to be interested in contacting any of the main providers of content unless they apparently are large education or corporate organizations. The shame here is that neither Brewster nor anyone in the administration at the Archive thinks (hey wait a minute, we have people willing to help us without charge, maybe we should at least talk with them and see if their ideas have any merit) and then, if they do not like what we have said, at least gives good reasons for not listening. Instead we get an approach more like, 'up yours', if you do not like it leave.
No one at the Archive will yet answer the question concerning what is wrong with having two sets of web sites; I am sure that the group of us who like the old site would keep it running as long as we can without any cost to the Archive. We would even be willing to host it on our own servers if that would be a problem. So you see what the frustration is … it appears that these decisions are being made arbitrarily (I am not saying they are, only that their [those in power at the Archive at the moment] non-communication makes it appear that they are). Apparently egos are such that people have invested themselves personally in the decisions which they have made and have an irrational reaction when some criticism is leveled. I assure you that this criticism is made out of a love for what has been created at the Archive and I am sorry if some people's egos get hurt along the way. The people whom I work with only know one way to work, that is professionally, truthfully, and honestly. We only know two ways to do things … good and even better. Optimization should be a hallmark of any system and I am sure people could find issues with what I do also. But the difference is that I am willing to listen to that criticism and incorporate good recommendations into my creations. 'Killing the messenger' has been going on, I am sure, since humans first walked the planet. No one is saying here that you have to do it 'my way', only that people should listen and then, if you do not want to accept the recommendation, at least be honorable enough to give a valid explanation of why you are not willing to listen. This argument has been going on for some time now … and I will not name them at this time, but there are people who have skills which would be extremely useful for the Archive as an organization and are willing to work voluntarily.
It is their loss and all of ours that the 'people at the top' have become so insular that they cannot even reply to a simple question: “Why not let a group of us take over the old (classic) site and continue its operation, at no cost to the Archive?”
WE ARE STILL WAITING FOR AN ANSWER.

Take a look at this link: http://www.openeducation.org/moodle/

These are my courses since 2011. I think what I am saying should have some credibility ... 74 courses in multiple different fields.

Gerry

This post was modified by garthus1 on 2016-01-22 03:15:13

Poster: stbalbach Date: Jan 21, 2016 7:48pm
Forum: texts Subject: Re: Removal of flagged content

Gerry, Jeff is right. I don't understand your position. It's not your organization. Friendly feedback and ideas are one thing, but cutting remarks about the staff? No one is entitled because of their age, work experience or upload count. Whatever the technical and policy disagreements, there's got to be a better way than haranguing the forums.

I worked in an early-era ISP with tens of thousands of customers. We had support forums (pre-web Usenet) and some people were continually critical of our work. It was like their full-time job to post complaints, tantrums, threats, etc. You may be on the other end of it if you have an online customer base to support. There's no glory; it's a combat zone where most of the time you are the pin cushion.

Stephen

Poster: garthus1 Date: Jan 29, 2016 1:01pm
Forum: texts Subject: Re: Removal of flagged content

Stbalbach,
Everything Is Fine On The Titanic?
From a customer service perspective you may have a point; however, I think many people miss the points being made here. The Archive is a non-profit organization and as such is heavily subsidized by the taxpayers whether many really understand this or not. The Archive is not a 'private' corporation since it is not exclusively funded by private money. When you agree to take nonprofit status you also have additional responsibilities to your members over and above what a private corporation has to its stock-holders. No one is trying to tell them how to run the Archive ... only asking why certain decisions were made ... and I think that is not too much to ask. A simple two-line answer ... maybe 'it will cost too much' or 'we do not have the time to do this', would suffice; however the 'up yours' attitude does not cut it. I never ran nor do I run my private companies like that. We offered to give them advice or information with no strings attached, but the management seems to not want any advice unless ‘they’ pay for it ... strange coming from a so-called nonprofit organization. You would think that efforts to optimize would be welcome ... but the exact opposite is true. I remember the cork-in-the-ear approach during the adoption of Flash, and the resultant waste of time and resources was so predictable. Progress is not made with kumbaya-kiss-ass-conformism as the driving force. The dialectic oft-times gets heated, but an intelligent person does not take criticism as a personal attack and understands that out of the strife of conflict come new, more adaptable and of course really 'better' systems. I am sure Brewster did not get rich by following the path of producing products or services that were increasingly harder to use and less optimal. That is not the way competition works in a free market. And unfortunately government-subsidized nonprofits which are run like private country clubs do not produce optimal results.
Those who refuse to listen to 'wake-up' calls will just repeat history and ultimately even though they may have a good product or service ... it will always function sub-optimally.

All this verbiage and no reason yet given why there cannot be two interfaces for the Archive ... I always had at least two interfaces to my web-sites, the front-end which put one's face on the Internet and maybe should 'look' good, whatever that means; but there was always a back end which was purely functional, since that is what we used to support the front-end. If one wants to produce better systems ... they must be willing to take and understand criticism ... without that you just have a group of kiss-ass groupies just telling each other how good everything is as the Titanic is heading towards the iceberg.

Gerry

Poster: AncientAxim Date: Jan 22, 2016 7:02am
Forum: texts Subject: Re: Removal of flagged content

Thanks Jeff! The folks at Internet Archive are not paid, correct?

BUT- with all respect, I myself have noticed a whole lot of new spam, and a lot of it is in another language- and not merely "mis-posted", it is nothing but spam. It's this new, weird creepy kind of spam- I nickname it "Sustainable" spam, on account of its bizarre meta-speak and for its multi-lingual content as well (little joke there)

Thanks Jeff Kaplan for taking time out to respond to us, and to do what you can to remove offenders. I wish there were more spam police, though. Yes, this is FREE, but that being so, does it not make more work for YOU to have to deal with archiving junk like "PASSWORD RECOVERY MICROSOFT LOSE WEIGHT FAST BEST DERMATOLOGY DOCTOR" etc etc ad nauseam?

Poster: garthus1 Date: Jan 29, 2016 11:19pm
Forum: texts Subject: Re: Removal of flagged content

Ancientaxim,

These people are paid; most of the people putting up content are not. Obviously the spammers are not paid by the Archive ... maybe by others peddling their content.

Gerry

Poster: Jeff Kaplan Date: Jan 22, 2016 8:00am
Forum: texts Subject: Re: Removal of flagged content

thanks for the spam query tip. i'm always looking for help there.